Learning Distance Functions with Product Space Boosting

Authors

  • Tomer Hertz
  • Aharon Bar-Hillel
  • Noam Shental
Abstract

A good distance function is an essential tool in applications that involve querying large databases, such as image retrieval and bioinformatics. We describe a non-parametric algorithm for distance function learning that is based on boosting low-grade weak learners in a product space. The algorithm learns a function defined over pairs of points, using supervision in the form of equivalence constraints. The weak learners are based on partitioning the original feature space, using a generic generative density-estimation model (a Gaussian mixture model, GMM) augmented by equivalence constraints on pairs of data points. Using a number of databases from the UCI repository, we show significantly improved results over methods that learn the parametric Mahalanobis distance. We also show initial results on image retrieval, using a large database of facial images (YaleB).
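
The idea of boosting in a product space can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain AdaBoost over labeled pairs, and it swaps the constrained GMM weak learner for a simple one-feature threshold partition. All function names (`stump_partition`, `pair_hypothesis`, `boost_pair_distance`) are hypothetical and chosen only for illustration.

```python
import math

def stump_partition(feature, threshold):
    """Stand-in weak partition (assumption: replaces the GMM): two cells split by a threshold."""
    return lambda x: int(x[feature] > threshold)

def pair_hypothesis(partition):
    """Lift a partition to the product space: +1 if both points share a cell, else -1."""
    return lambda a, b: 1 if partition(a) == partition(b) else -1

def boost_pair_distance(points, constraints, rounds=5):
    """AdaBoost over pairs. constraints holds (i, j, y): y=+1 equivalent, y=-1 not equivalent."""
    # Candidate weak learners: thresholds midway between consecutive values of each feature.
    candidates = []
    for f in range(len(points[0])):
        vals = sorted({p[f] for p in points})
        for lo, hi in zip(vals, vals[1:]):
            candidates.append(pair_hypothesis(stump_partition(f, (lo + hi) / 2)))

    weights = [1.0 / len(constraints)] * len(constraints)
    ensemble = []
    for _ in range(rounds):
        # Select the candidate with the lowest weighted error on the pair constraints.
        best, best_err = None, None
        for h in candidates:
            err = sum(w for w, (i, j, y) in zip(weights, constraints)
                      if h(points[i], points[j]) != y)
            if best_err is None or err < best_err:
                best, best_err = h, err
        if best is None or best_err >= 0.5:
            break  # no candidate is better than chance
        err = max(best_err, 1e-10)  # avoid division by zero for a perfect learner
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        # Re-weight pairs: misclassified pairs gain weight, then normalize.
        weights = [w * math.exp(-alpha * y * best(points[i], points[j]))
                   for w, (i, j, y) in zip(weights, constraints)]
        total = sum(weights)
        weights = [w / total for w in weights]

    norm = sum(alpha for alpha, _ in ensemble)

    def distance(a, b):
        """Map the weighted vote in [-1, 1] to a dissimilarity in [0, 1]."""
        if norm == 0:
            return 0.5
        score = sum(alpha * h(a, b) for alpha, h in ensemble) / norm
        return (1 - score) / 2

    return distance

# Toy usage: two well-separated 1-D groups with four pairwise constraints.
points = [[0.0], [0.1], [1.0], [1.1]]
constraints = [(0, 1, 1), (2, 3, 1), (0, 2, -1), (1, 3, -1)]
dist = boost_pair_distance(points, constraints)
```

In this sketch the learned function is defined on pairs rather than on single points, which is what makes it a distance-like function; whether it satisfies metric properties such as the triangle inequality is not guaranteed.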


Similar resources

Multiclass Semi-supervised Boosting Using Different Distance Metrics

The goal of this thesis project is to build an effective multiclass classifier which can be trained with a small amount of labeled data and a large pool of unlabeled data by applying semi-supervised learning in a boosting framework. Boosting refers to a general method of producing a very accurate classifier by combining rough and moderately inaccurate classifiers. It has attracted a significant...


Feature Selection in Distance Learning from Small Sample

Learning from a small sample is an acute problem which arises in many applications where acquiring new samples is difficult, time consuming or expensive. The problem becomes even harder when dealing with rich high dimensional data. The learning process in such cases is often preceded by dimensionality reduction or feature selection. The need to avoid overfitting of an algorithm to the data is c...


Semi-Supervised Composite Kernel Learning Using Distance Metric Learning Techniques

The distance metric plays a key role in many machine learning and computer vision algorithms, so choosing an appropriate distance metric directly affects the performance of such algorithms. Recently, distance metric learning using labeled data or other available supervisory information has become a very active research area in machine learning applications. Studies in this area have shown t...


Generalized Dictionary for Multitask Learning with Boosting

While multitask learning has been extensively studied, most existing methods rely on linear models (e.g. linear regression, logistic regression), which may fail in dealing with more general (nonlinear) problems. In this paper, we present a new approach that combines dictionary learning with gradient boosting to achieve multitask learning with general (nonlinear) basis functions. Specifically, f...


Composite Kernel Optimization in Semi-Supervised Metric

Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...




Publication date: 2003